https://xuanshay.github.io

Project: Popular Video Game Trend

Group Members: Yuxuan Zhang, Haowen Weng

With the rapid development of electronic information technology, video games have become one of the most popular forms of entertainment. In recent years in particular, e-sports has gained a degree of recognition in society, and among young people gaming-related topics now make up a noticeable share of everyday conversation. Against this background, we became interested in the video game industry. We hope to study the current mainstream trends in video games with scientific methods and to give game industry practitioners something to think about.


Contents:


Purpose and Questions:

As we said, trends are our core research direction, and we break them down into several questions that we discuss step by step. (If you don't want to read this part, you can skip ahead to the three questions at the end.)

First is the macro market. How has the market of the entire game industry been distributed since the advent of video games in the last century? Because our group is small, we do not consider the recent boom in the mobile game market, nor do we focus on trends that go back too far, so our first question is: How have the major game makers fared in the 21st century? This will help us understand which major manufacturers players favor.

Second is players' preferences for game types. Since cultural differences lead players to respond differently to different types of games, we will explore the preferences of players in each of the four major regions where games are sold: North America, Europe, Japan, and others.

Third is the trend of individual games. Although people's choices vary for a variety of reasons, there will always be individual games that are popular, and there will always be similarities and differences between those games. We hope that studying the average lifespan of such games will provide a relatively reliable standard for those in the game industry.


Datasets:


Initialization


Data Generalization, Extraction, Transform, and Load

Data Set 1: Video Games Sales

This dataset was generated by a scrape of VGChartz. It contains a list of video games with sales greater than 100,000 copies worldwide. For each game it gives the sales in the regions we mentioned (NA, EU, JP, others), along with the game name, platform, release year, genre, and publisher.

Notice: 1.) Sales are in millions. 2.) Data collected from 2017.

Here is a quick peek at what it looks like:

Data Set 2: Steam database

First, we scrape the data and generate the dataframe using several pandas functions.

Data Set 3: Previous player counts for games on Steam


Analyzing datasets and getting into the problems

1. Game production companies' performance in the 21st century.

----Which game producers are players' favorites?

To find the mainstream tendency of the gaming industry, we should first investigate the publishing companies. Let's get an overview of the performance of all the manufacturers since 2000. Here we show the games released after 2000.

That's quite a lot. Now we group these games by their producer and look at the producers with more than 50 million copies in game sales. We will use a pie chart to see how the market shares are divided.
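The filtering and grouping step described above can be sketched roughly as follows. This is a minimal illustration on made-up rows, not the notebook's actual cell: the column names (`Publisher`, `Year`, `Global_Sales`) and all sales figures are assumptions chosen for the example.

```python
import pandas as pd

# Toy stand-in for the VGChartz table. Column names and all numbers
# are assumptions for illustration, not values from the real dataset.
sales = pd.DataFrame({
    "Publisher": ["Nintendo", "Nintendo", "Electronic Arts",
                  "Electronic Arts", "Atari", "Atari"],
    "Year": [2006, 1996, 2010, 2004, 2001, 2003],
    "Global_Sales": [60.0, 30.0, 55.0, 25.0, 20.0, 5.0],  # millions of copies
})

# Keep only games released in 2000 or later.
recent = sales[sales["Year"] >= 2000]

# Total sales per publisher, largest first.
by_publisher = (
    recent.groupby("Publisher")["Global_Sales"]
    .sum()
    .sort_values(ascending=False)
)

# Keep publishers with more than 50 million copies sold.
major = by_publisher[by_publisher > 50]

# Share of each retained publisher; these are the pie chart slices.
share = major / major.sum()
print(share)
```

The `share` series could then be passed to a plotting call such as `share.plot.pie()` if matplotlib is available.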

This chart tells us that the game industry is relatively static: a few companies take a big share of the video game market.

However, this chart aggregates all sales from 2000 until now. To investigate performance at a macro level, we want to know how game publishers' sales change over time.

Let's set the period to 4 years. We divide the time span into 2000-2003, 2004-2007, 2008-2011, and 2012-2016 (the last period covers five years), called q1, q2, q3, and q4 respectively. For each period, we collect the top 10 producers in the market.
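One way to assign games to these periods and pick the top publishers in each is `pd.cut` followed by a group-by. The sketch below uses invented rows and assumed column names (`Publisher`, `Year`, `Global_Sales`); the helper `top_publishers` is hypothetical and only illustrates the idea.

```python
import pandas as pd

# Invented rows; column names and figures are assumptions.
sales = pd.DataFrame({
    "Publisher": ["Nintendo", "Atari", "Nintendo", "THQ",
                  "Electronic Arts", "Konami", "Nintendo"],
    "Year": [2001, 2003, 2005, 2007, 2009, 2011, 2016],
    "Global_Sales": [10.0, 4.0, 12.0, 3.0, 9.0, 2.0, 7.0],
})

# Bins are right-inclusive: (1999, 2003], (2003, 2007], (2007, 2011], (2011, 2016].
bins = [1999, 2003, 2007, 2011, 2016]
labels = ["q1", "q2", "q3", "q4"]
sales["Period"] = pd.cut(sales["Year"], bins=bins, labels=labels)


def top_publishers(df, period, n=10):
    """Return the n publishers with the highest total sales in one period."""
    subset = df[df["Period"] == period]
    totals = subset.groupby("Publisher")["Global_Sales"].sum()
    return totals.sort_values(ascending=False).head(n)


top_q1 = top_publishers(sales, "q1", n=2)
print(top_q1)
```

Using explicit bin edges keeps the uneven last period (2012-2016) visible in the code rather than hidden in a fixed-width rule.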

Now we can view the top 10 from each period in order.

Next, we join the tables to get the flow of the top 10 publishers over the 4 periods.
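A possible way to do this join, assuming each period's top list is a pandas Series indexed by publisher, is an outer join via `pd.concat`. The numbers below are placeholders, not results from the dataset, and only two periods are shown.

```python
import pandas as pd

# Placeholder top lists for two periods; the figures are invented.
q1 = pd.Series({"Nintendo": 300.0, "Electronic Arts": 250.0, "Atari": 80.0},
               name="q1")
q2 = pd.Series({"Nintendo": 500.0, "Electronic Arts": 400.0, "THQ": 120.0},
               name="q2")

# Concatenating along columns aligns on the publisher index (outer join),
# so a publisher missing from a period's top list shows up as NaN there.
flow = pd.concat([q1, q2], axis=1)

# Publishers that were in the q1 list but dropped out of the q2 list.
dropped_out = flow.index[flow["q1"].notna() & flow["q2"].isna()].tolist()
print(flow)
print(dropped_out)
```

The NaN cells are what make it easy to spot publishers that leave or enter the top list between periods.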

Now let's get a visual aid showing how the top 10 game publishers' sales in the 21st century look (the y-axis is sales in millions of copies).

Now we can sum up our thinking on our first question:

It's clear that over the years some game companies have managed to keep a firm hold on the video game market, and only three companies have been overtaken by others: Atari (after 2004), and THQ and Konami (both after 2012). Meanwhile, Electronic Arts and Nintendo stayed in the top 2 the whole time, with the rest still remaining on the list.

It is conceivable that some opinions have formed in players' minds, such as that XXX's games should be excellent, and that these opinions will dominate the mainstream in the short term.

2. Players' taste in different regions

--What do players prefer to play in different cultures and environments?

To do this, we first need to analyze the sales of different genres within our 4 main regions of game sales (North America, Europe, Japan, Others). Here we use pie charts again, which give a good view of the favorite game genres (for example, what percentage one genre takes and how it compares to other genres).

After seeing every region's distribution, we standardize the value of each game type within a region. We can then build a table joining all the data and also calculate the world_average in order to find the main trend worldwide.
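The standardization can be sketched as dividing each region's genre sales by that region's total, so every region sums to 1. In this sketch the regional column names and figures are invented, and `world_average` is assumed to be the plain mean of the four regional shares; the notebook may compute it differently.

```python
import pandas as pd

# Invented genre totals per region (millions of copies); column names
# are assumptions about how the regional sales columns are labelled.
genre_sales = pd.DataFrame(
    {
        "NA_Sales": [50.0, 30.0, 20.0],
        "EU_Sales": [40.0, 40.0, 20.0],
        "JP_Sales": [10.0, 10.0, 80.0],
        "Other_Sales": [25.0, 50.0, 25.0],
    },
    index=["Action", "Sports", "Role-Playing"],
)

# Divide every column by its own total so each region sums to 1.
shares = genre_sales.div(genre_sales.sum(axis=0), axis="columns")

# Assumption: the world average is the unweighted mean across regions.
shares["world_average"] = shares.mean(axis=1)
print(shares)
```

An unweighted mean treats every region equally; weighting by regional market size would instead reproduce the genre's share of global sales.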

This table looks good. Now let's look at the trend of all game genres in the graph below.

Now we can sum up our thinking on our second question:

It is very clear which types of games are preferred by players from different areas.

1) Action (0.194) seems to be the most favored genre around the world overall; second is Sports (0.146) and third is Role-Playing (0.125).
2) The least popular genres are Adventure (0.0280), Puzzle (0.0273), and Strategy (0.0217). This might be due to their relatively hardcore playing style, which many consumers do not find entertaining.
3) Japanese players are indeed very fond of Role-Playing games (that long green bar). This might be driven by the well-developed JRPG genre and the strong local culture.

3. The lifetime of a game

Find the data we need from the different datasets

For this part, we are going to find the ranking of the top 100 games from 2016 to the present.
Afterwards, we will look at the changes in rankings, try to find a trend, and estimate the average number of years a game can be considered popular.
In this part, a game is defined as popular if its number of players is in the top 100.

A little formatting for the scraped dataframe

Collect data from all databases

For the previous rankings, we rank games based on their average number of players.
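Ranking by average players can be sketched as a two-step group-by: average the player counts per game and year, then rank within each year. The games, column names (`game`, `year`, `avg_players`), and counts below are all hypothetical, and a cutoff of 2 stands in for the top 100.

```python
import pandas as pd

# Hypothetical monthly player counts; names and numbers are invented.
history = pd.DataFrame({
    "game": ["Game A", "Game A", "Game B", "Game B", "Game C", "Game C",
             "Game A", "Game B", "Game C"],
    "year": [2016, 2016, 2016, 2016, 2016, 2016, 2017, 2017, 2017],
    "avg_players": [100.0, 200.0, 300.0, 500.0, 50.0, 70.0,
                    500.0, 100.0, 250.0],
})

# Step 1: mean player count per game and year.
yearly = history.groupby(["year", "game"], as_index=False)["avg_players"].mean()

# Step 2: rank within each year, rank 1 = most players.
yearly["rank"] = (
    yearly.groupby("year")["avg_players"]
    .rank(ascending=False, method="first")
    .astype(int)
)

# Keep only the "popular" games (top 100 in the text, top 2 in this toy data).
top_n = 2
popular = yearly[yearly["rank"] <= top_n]
print(popular.sort_values(["year", "rank"]))
```

`method="first"` breaks ties by order of appearance so that every rank is used exactly once per year.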

2020 Ranking

2019 Ranking

2018 Ranking

2017 Ranking

2016 Ranking

Generating a bump chart from top_game

We found a way to generate a bump chart in pandas on this website: https://stackoverflow.com/questions/68095438/how-to-make-a-bump-chart

Since the graph for the top 100 games is too messy, we focus on the top 20 instead.
To have a closer look at the top 20 ranking:

Here, we only use the data for the top 20 games, so we first need to clean the dataframe by deleting the games and rankings that fall outside the top 20.
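A rough sketch of this cleaning step: filter on the rank column, then pivot to one row per year and one column per game, which is the shape a bump chart usually needs. The table layout (`year`, `game`, `rank`) and the values are assumptions, and a cutoff of 2 stands in for 20.

```python
import pandas as pd

# Hypothetical long-format ranking table; all values are invented.
rankings = pd.DataFrame({
    "year": [2016, 2016, 2016, 2017, 2017, 2017],
    "game": ["Game A", "Game B", "Game C", "Game A", "Game B", "Game C"],
    "rank": [1, 2, 3, 3, 1, 2],
})

# Drop every row ranked below the cutoff (20 in the text, 2 here).
cutoff = 2
top = rankings[rankings["rank"] <= cutoff]

# One row per year, one column per game; NaN means the game was
# outside the cutoff in that year.
bump = top.pivot(index="year", columns="game", values="rank")
print(bump)
```

Plotting each column of `bump` as a line with an inverted y-axis would give the bump chart itself.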

From the graph, we can see how the rankings change from 2016 to the present.
We will then use the data to calculate the average lifetime of a game. With the conclusions from this project, video game companies could meaningfully set a timeline for the production and announcement of their products.

Find the lifetime of a game:
Here, we define the lifetime of a game as the time it stays in the list of the top 100 games in the world.
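One reading of this definition is to count the number of distinct years in which a game appears in the yearly top list and then average over games. The sketch below follows that reading on toy data with hypothetical game names; the notebook's actual computation may differ (for example, it may count years since release).

```python
import pandas as pd

# Toy yearly top lists with hypothetical games (top 3 instead of top 100).
top_lists = {
    2016: ["Game A", "Game B", "Game C"],
    2017: ["Game A", "Game B", "Game D"],
    2018: ["Game A", "Game D", "Game E"],
}

# Flatten into one (year, game) row per appearance.
rows = [(year, game) for year, games in top_lists.items() for game in games]
appearances = pd.DataFrame(rows, columns=["year", "game"])

# Lifetime = number of distinct years a game appears in the top list.
lifetime = appearances.groupby("game")["year"].nunique()

# Average lifetime over all games that ever made the list.
average_lifetime = lifetime.mean()
print(lifetime)
print(average_lifetime)
```

Note that a count like this is capped by the number of years observed, so games that were already popular before the first year are undercounted.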

We can see that the average lifetime of a game is 4.15 years.

Regression model for lifetime

Average ranking of all games that have been in the top ranking from 2016 to the present:

So it seems that there is no relationship between a game's ranking and the number of years since its publication, and we therefore failed to build a regression line relating ranking and year. If we leave out the data for 2021 (since the outbreak of Covid-19 brought many unpredictable side effects to the game production industry), the rankings of the games are roughly the same. This implies that if a game is popular at first, it is very likely to stay popular for 5 years. This is a good sign, since the average time a game spends in the top ranking is only about 4 years.
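A simple way to check for a linear relationship of this kind is to fit a straight line and look at the slope and the coefficient of determination. The sketch below uses NumPy on invented average ranks, not the notebook's data; a slope near zero together with a small r squared is what "no relationship" would look like.

```python
import numpy as np

# Invented values: years since release and the average rank at that age.
years_since_release = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
average_rank = np.array([50.0, 52.0, 49.0, 51.0, 50.0])

# Least-squares line: average_rank ~ slope * years + intercept.
slope, intercept = np.polyfit(years_since_release, average_rank, deg=1)

# Squared Pearson correlation: share of variance explained by the line.
r = np.corrcoef(years_since_release, average_rank)[0, 1]
r_squared = r ** 2

print(f"slope={slope:.3f}, intercept={intercept:.3f}, r^2={r_squared:.3f}")
```

With so few points, even a visibly nonzero slope would not be strong evidence, so the fit is best read as a descriptive check rather than a test.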